115 research outputs found

    Partially-observed models for classifying minerals on Mars

    The identification of phyllosilicates by NASA's CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) strongly suggests the presence of water-related geological processes. A variety of water-bearing phyllosilicate minerals have already been identified by several research groups utilizing spectral enrichment techniques and matching phyllosilicate-rich regions on the Martian surface to known spectra of minerals found on Earth. However, fully automated analysis of the CRISM data remains a challenge for two main reasons. First, there is significant variability in the spectral signature of the same mineral obtained from different regions on the Martian surface. Second, the list of minerals confirmed to date, which constitutes the set of training classes, is not exhaustive. Thus, when classifying new regions with a classifier trained on selected minerals and chemicals, one must consider the potential presence of unknown materials not represented in the training library. We made an initial attempt to study these problems in the context of our recent work on partially-observed classification models and present results that show the utility of such models in identifying spectra of unknown minerals while simultaneously recognizing spectra of known minerals.

    A Case Study for Massive Text Mining: K Nearest Neighbor Algorithm on PubMed data

    Poster abstract: The US National Library of Medicine (NLM) holds a collection of millions of books, journals, and other publications in the medical domain. NLM maintains the MEDLINE database to store and link citations to these publications, which lets researchers and students find medical articles easily. The public can search MEDLINE through PubMed. When new PubMed documents become available online, curators have to assign labels to them manually. The process is tedious and time-consuming because there are more than 27,149 descriptors (MeSH terms). Although curators already use a system called MTI to suggest MeSH terms, its performance needs to be improved. This research explores the use of text classification to annotate new PubMed documents automatically, efficiently, and with reasonable accuracy. The data come from the BioASQ contest and contain 4 million abstracts. The research process consists of preprocessing the data, reducing the feature space, classifying, and evaluating the results. We focus on the K nearest neighbor algorithm in this case study.
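The K nearest neighbor labeling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the whitespace tokenizer, the cosine similarity, the `min_votes` threshold, the function names, and the four-document toy corpus are all assumptions made for illustration.

```python
from collections import Counter
import math

def vectorize(text):
    """Lower-cased, whitespace-tokenized term counts (stand-in for real preprocessing)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_labels(query, corpus, k=3, min_votes=2):
    """Suggest labels that at least `min_votes` of the k most similar documents carry.

    corpus is a list of (abstract_text, set_of_labels) pairs.
    """
    q = vectorize(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, vectorize(doc[0])), reverse=True)
    votes = Counter(label for _, labels in ranked[:k] for label in labels)
    return sorted(label for label, n in votes.items() if n >= min_votes)

# Hypothetical toy corpus; the texts and label assignments are illustrative only.
toy_corpus = [
    ("insulin therapy in diabetes patients", {"Diabetes Mellitus", "Insulin"}),
    ("glucose control and insulin dosing", {"Insulin", "Blood Glucose"}),
    ("earthquake aftershock sequences", {"Earthquakes"}),
    ("diabetes and blood glucose monitoring", {"Diabetes Mellitus", "Blood Glucose"}),
]

suggested = knn_labels("insulin dosing for diabetes", toy_corpus, k=3, min_votes=2)
print(suggested)
```

With millions of abstracts, ranking every document per query as done here would be too slow; the feature-space reduction step mentioned in the abstract is one way such a search is typically made tractable.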

    Rare Jarosite Detection in CRISM Imagery by Non-Parametric Bayesian Clustering

    Discovery of rare phases on Mars is important as they serve as indicators of the geochemistry of the Martian surface and facilitate understanding of mineral assemblages within a geologic unit. Identification of rare minerals in Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) visible/shortwave infrared (VSWIR) images, which have high spatial and spectral resolution, has been a challenge. Additive and multiplicative noise and other artifacts affect all collected images, and the regions hosting these minerals have limited spatial extent. In an effort to automate this task, we evaluate various clustering algorithms using the detection of rare jarosite, associated with spectrally similar minerals in CRISM imagery, as a case study. We compare nonparametric Bayesian and standard clustering algorithms and show that a recently developed doubly nonparametric Bayesian model could be effective for this task.

    A coupled ETAS-I2GMM point process with applications to seismic fault detection

    The epidemic-type aftershock sequence (ETAS) point process is a common model for the occurrence of earthquake events. The ETAS model consists of a stationary background Poisson process modeling spontaneous earthquakes and a triggering kernel representing the space–time-magnitude distribution of aftershocks. Popular nonparametric methods for estimating the background intensity include histograms and kernel density estimators. While these methods are able to capture local spatial heterogeneity in the intensity of spontaneous events, they capture poorly the patterns that result from fault line structure over larger spatial scales. Here we propose a two-layer infinite Gaussian mixture model for clustering earthquake events into fault-like groups over intermediate spatial scales. We introduce a Monte Carlo expectation-maximization (EM) algorithm for joint inference of the ETAS-I2GMM model and then apply the model to the Southern California Earthquake Catalog. We illustrate the advantages of the ETAS-I2GMM model in terms of both goodness of fit of the intensity and recovery of fault line clusters in the Community Fault Model 3.0 from earthquake occurrence data.
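The split between a background rate and a triggering kernel can be made concrete with a small sketch. It is a temporal-only simplification, not the coupled ETAS-I2GMM model of the paper: the Omori-Utsu decay with an exponential magnitude term is a commonly used form of the kernel, and every parameter value below (`mu`, `K`, `alpha`, `c`, `p`, `m0`) is an arbitrary assumption.

```python
import math

def etas_intensity(t, events, mu=0.1, K=0.05, alpha=1.0, c=0.01, p=1.2, m0=3.0):
    """Conditional intensity at time t for a temporal ETAS sketch.

    events: list of (time, magnitude) pairs.
    mu: constant background rate (spontaneous earthquakes).
    The sum is the triggering term: each past event adds an aftershock
    rate that grows with its magnitude and decays with elapsed time.
    """
    triggered = sum(
        K * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
        for t_i, m_i in events
        if t_i < t  # only earlier events can trigger aftershocks
    )
    return mu + triggered

# Hypothetical catalog: a magnitude 5.0 event at t=1.0 and a 3.5 event at t=2.0.
catalog = [(1.0, 5.0), (2.0, 3.5)]

before = etas_intensity(0.5, catalog)       # no earlier events: background only
just_after = etas_intensity(1.1, catalog)   # shortly after the large event
later = etas_intensity(1.9, catalog)        # aftershock rate has decayed
print(before, just_after, later)
```

In the full spatial model the constant `mu` becomes a background intensity over location, which is the component the abstract proposes to estimate with the two-layer mixture rather than with histograms or kernel density estimators.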

    The Infinite Mixture of Infinite Gaussian Mixtures

    The Dirichlet process mixture of Gaussians (DPMG) has been used in the literature for clustering and density estimation problems. However, many real-world data sets exhibit cluster distributions that cannot be captured by a single Gaussian. Modeling such data sets with DPMG creates several extraneous clusters even when clusters are relatively well-defined. Herein, we present the infinite mixture of infinite Gaussian mixtures (I2GMM) for more flexible modeling of data sets with skewed and multi-modal cluster distributions. Instead of using a single Gaussian for each cluster as in the standard DPMG model, the generative model of I2GMM uses a single DPMG for each cluster. The individual DPMGs are linked together by centering their base distributions at the atoms of a higher-level DP prior. Inference is performed by a collapsed Gibbs sampler that also enables partial parallelization. Experimental results on several artificial and real-world data sets suggest that the proposed I2GMM model can predict clusters more accurately than existing variational Bayes and Gibbs sampler versions of DPMG.
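As background for why a Dirichlet process prior does not fix the number of clusters in advance, here is a minimal sketch of the Chinese restaurant process, a standard way to draw a partition from a DP. It covers only the partition prior, not the Gaussian likelihoods, the two-layer I2GMM construction, or the collapsed Gibbs sampler; the function name and the concentration value are assumptions.

```python
import random

def crp_partition(n, alpha, rng):
    """Draw cluster assignments for n points from a Chinese restaurant process.

    Point i joins existing cluster k with probability proportional to its
    current size, or opens a new cluster with probability proportional to
    the concentration parameter alpha.
    """
    assignments = []
    counts = []  # counts[k] is the current size of cluster k
    for _ in range(n):
        weights = counts + [alpha]  # last slot stands for "open a new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

assignments, counts = crp_partition(50, alpha=1.0, rng=random.Random(0))
print(len(counts), counts)
```

A larger `alpha` tends to produce more clusters. In a DPMG each such cluster would be paired with one Gaussian, whereas the abstract describes I2GMM as pairing each top-level cluster with its own DPMG.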

    Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero Crater and NE Syrtis

    A hierarchical Bayesian classifier is trained at pixel scale with spectral data from CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) images. Its utility in detecting small exposures of uncommon phases is demonstrated with new geologic discoveries near the Mars-2020 rover landing site. Akaganeite is found in sediments on the Jezero crater floor and in fluvial deposits at NE Syrtis. Jarosite and silica are found on the Jezero crater floor, while chlorite-smectite and Al phyllosilicates are found in the Jezero crater walls. These detections point to a multi-stage, multi-chemistry history of water in Jezero crater and the surrounding region and provide new information for guiding the Mars-2020 rover's landed exploration. In particular, the akaganeite, silica, and jarosite in the floor deposits suggest either a later episode of salty, Fe-rich waters that post-date the Jezero crater delta or groundwater alteration of portions of the Jezero crater sedimentary sequence.

    Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero and NE Syrtis

    A hierarchical Bayesian classifier is trained at pixel scale with spectral data from CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) imagery. Its utility in detecting rare phases is demonstrated with new geologic discoveries near the Mars-2020 rover landing site. Akaganeite is found in sediments on the Jezero crater floor and in fluvial deposits at NE Syrtis. Jarosite and silica are found on the Jezero crater floor, while chlorite-smectite and Al phyllosilicates are found in the Jezero crater walls. These detections point to a multi-stage, multi-chemistry history of water in Jezero crater and the surrounding region and provide new information for guiding the Mars-2020 rover's landed exploration. In particular, the akaganeite, silica, and jarosite in the floor deposits suggest either a later episode of salty, Fe-rich waters that post-date the Jezero delta or groundwater alteration of portions of the Jezero sedimentary sequence.
